Bracken: estimating species abundance in metagenomics data
Authors
Abstract
Metagenomic experiments attempt to characterize microbial communities using high-throughput DNA sequencing. Identification of the microorganisms in a sample provides information about the genetic profile, population structure, and role of microorganisms within an environment. Until recently, most metagenomics studies focused on high-level characterization at the level of phyla, or alternatively sequenced the 16S ribosomal RNA gene that is present in bacterial species. As the cost of sequencing has fallen, though, metagenomics experiments have increasingly used unbiased shotgun sequencing to capture all the organisms in a sample. This approach requires a method for estimating abundance directly from the raw read data. Here we describe a fast, accurate new method that computes the abundance at the species level using the reads collected in a metagenomics experiment. Bracken (Bayesian Reestimation of Abundance after Classification with KrakEN) uses the taxonomic assignments made by Kraken, a very fast read-level classifier, along with information about the genomes themselves to estimate abundance at the species level, the genus level, or above. We demonstrate that Bracken can produce accurate species- and genus-level abundance estimates even when a sample contains multiple near-identical species.
Subjects: Bioinformatics, Computational Biology
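The reestimation idea in the abstract can be illustrated with a small sketch. This is a simplified illustration and not the authors' implementation: it assumes a single genus with two species, takes the prior for each species as proportional to its species-level read count, and uses made-up values for the fraction of each genome's reads that a classifier can only place at the genus. The function name `redistribute_genus_reads` and all inputs are hypothetical.

```python
def redistribute_genus_reads(species_reads, genus_reads, frac_at_genus):
    """Reassign reads classified only at the genus level to its species.

    species_reads: reads classified directly at each species (hypothetical).
    genus_reads:   reads the classifier could place only at the genus.
    frac_at_genus: assumed fraction of each species' reads that end up at
                   the genus level because the sequence is shared.
    """
    # Bayes' rule, unnormalized: P(genus-level | species) * prior(species).
    # The prior is assumed proportional to the species-level read count.
    weights = {s: frac_at_genus[s] * species_reads[s] for s in species_reads}
    total = sum(weights.values())
    if total == 0:
        # No evidence to split the genus-level reads; leave counts unchanged.
        return dict(species_reads)
    # Each species receives a share of the genus-level reads equal to its
    # normalized posterior weight.
    return {
        s: species_reads[s] + genus_reads * weights[s] / total
        for s in species_reads
    }


if __name__ == "__main__":
    # Hypothetical counts: two near-identical species in one genus.
    estimate = redistribute_genus_reads(
        species_reads={"species_A": 600, "species_B": 200},
        genus_reads=200,
        frac_at_genus={"species_A": 0.25, "species_B": 0.75},
    )
    print(estimate)
```

In this toy input, species_B has fewer species-level reads but a larger assumed share of ambiguous sequence, so the two species split the genus-level reads evenly. The total read count is preserved by construction.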
Similar resources
Estimating the habitat suitability of the genus Alosa in the Caspian Sea using the PATREC method and presence data
Many habitat evaluation methods use abundance data, but such data are not available for many species. However, some websites provide species presence data based on published studies. The present study used the PATREC method to estimate the habitat suitability of the Caspian Sea for the genus Alosa. The PATREC method needs abundance data to calculate the pri...
Can models of presence-absence be used to scale abundance? Two case studies considering extremes in life history
Understanding patterns of species occurrence and abundance is a central theme of ecology, natural resource management, and conservation. Although occurrence models have been widely used for describing species distribution, particularly for rare species, abundance models are less common, despite greater information for conservation and management. Because presence-absence data are easier and les...
mmnet: Metagenomic analysis of microbiome metabolic network
2 Analysis Pipeline: from raw Metagenomic Sequence Data to Metabolic Network Analysis; 2.1 Prepare metagenomic sequence data; 2.2 Annotation of Metagenomic Sequence Reads; 2.3 Estimating the abundance of enzymatic genes; 2.4 Building reference metabolic dataset; 2.5 C...
Better primer design for metagenomics applications by increasing taxonomic distinguishability
Current methods of understanding microbiome composition and structure rely on accurately estimating the number of distinct species and their relative abundance. Most of these methods require an efficient PCR whose forward and reverse primers bind well to the same, large number of identifiable species, and produce amplicons that are unique. It is therefore not surprising that currently used univ...
MetaCluster 5.0: a two-round binning approach for metagenomic data for low-abundance species in a noisy sample
Motivation: Metagenomic binning remains an important topic in metagenomic analysis. Existing unsupervised binning methods for next-generation sequencing (NGS) reads do not perform well on (i) samples with low-abundance species or (ii) samples (even with high abundance) when there are many extremely low-abundance species. These two problems are common for real metagenomic datasets. Binning method...
Journal title:
- PeerJ Computer Science
Volume 3, issue
Pages -
Publication date 2017